

Jailbreaking Large Language Models in Infinitely Many Ways

Goldstein, Oliver, La Malfa, Emanuele, Drinkall, Felix, Marro, Samuele, Wooldridge, Michael

arXiv.org Artificial Intelligence

We discuss the "Infinitely Many Meanings" attacks (IMM), a category of jailbreaks that leverages the increasing capabilities of a model to handle paraphrases and encoded communications in order to bypass its defensive mechanisms. The viability of IMMs grows with a model's capability to handle and bind the semantics of simple mappings between tokens; they work extremely well in practice, posing a concrete threat to users of the most powerful commercial LLMs. We show how one can bypass the safeguards of the most powerful open- and closed-source LLMs and generate content that explicitly violates their safety policies. One can protect against IMMs by improving the guardrails and making them scale with the LLMs' capabilities. For two categories of attacks that are straightforward to implement, i.e., bijection and encoding, we discuss two defensive strategies, one in token space and the other in embedding space. We conclude with some research questions we believe should be prioritised to enhance the defensive mechanisms of LLMs and our understanding of their safety.


Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks

Li, Xingxuan, Xu, Weiwen, Zhao, Ruochen, Jiao, Fangkai, Joty, Shafiq, Bing, Lidong

arXiv.org Artificial Intelligence

State-of-the-art large language models (LLMs) exhibit impressive problem-solving capabilities but may struggle with complex reasoning and factual correctness. Existing methods harness the strengths of chain-of-thought (CoT) and retrieval-augmented generation (RAG) to decompose a complex problem into simpler steps and apply retrieval to improve factual correctness. These methods work well on straightforward reasoning tasks but often falter on challenging tasks such as competitive programming and mathematics, due to frequent reasoning errors and irrelevant knowledge retrieval. To address this, we introduce Critic-guided planning with Retrieval-augmentation, CR-Planner, a novel framework that leverages fine-tuned critic models to guide both reasoning and retrieval processes through planning. CR-Planner solves a problem by iteratively selecting and executing sub-goals. Initially, it identifies the most promising sub-goal from reasoning, query generation, and retrieval, guided by rewards given by a critic model named sub-goal critic. It then executes this sub-goal through sampling and selecting the optimal output based on evaluations from another critic model named execution critic. This iterative process, informed by retrieved information and critic models, enables CR-Planner to effectively navigate the solution space towards the final answer. We employ Monte Carlo Tree Search (MCTS) to collect the data for training the critic models, allowing for a systematic exploration of action sequences and their long-term impacts. Our experiments demonstrate that CR-Planner significantly outperforms baselines, highlighting its effectiveness in addressing challenging problems by improving both reasoning and retrieval.
Existing approaches (Yao et al., 2023b; Zhao et al., 2023b; Li et al., 2024) seek to harness the strengths of both chain-of-thought (CoT) reasoning (Wei et al., 2022) and retrieval-augmented generation (RAG) (Lewis et al., 2020) on knowledge-intensive complex reasoning problems.
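The CR-Planner loop described in the abstract can be sketched in a few lines. This is a minimal illustration only: the critic scoring, sub-goal executors, and candidate-sampling logic below are hypothetical stand-ins, not the authors' trained critic models or implementation.

```python
# Toy sketch of CR-Planner's iterative loop: at each step a sub-goal critic
# scores the candidate sub-goals (reason / generate query / retrieve), the
# best one is executed by sampling candidates, and an execution critic picks
# the best candidate. All scoring functions here are illustrative dummies.

SUB_GOALS = ["reason", "generate_query", "retrieve"]

def sub_goal_critic(state, sub_goal):
    # Dummy reward: prefer reasoning first, then forming a query, then
    # retrieval (a trained critic would score the actual planning state).
    done = [g for g, _ in state["trace"]]
    order = ["reason", "generate_query", "retrieve"]
    next_goal = order[min(len(done), 2)]
    return 1.0 if sub_goal == next_goal else 0.0

def execution_critic(candidate):
    # Dummy evaluation of a sampled output; a trained critic would judge
    # correctness and relevance.
    return len(candidate)

def execute(state, sub_goal, samples=3):
    # Sample several candidate outputs for the chosen sub-goal and keep the
    # one the execution critic scores highest.
    candidates = [f"{sub_goal}-output-{i}" for i in range(samples)]
    return max(candidates, key=execution_critic)

def cr_planner(question, max_steps=4):
    state = {"question": question, "trace": []}
    for _ in range(max_steps):
        goal = max(SUB_GOALS, key=lambda g: sub_goal_critic(state, g))
        state["trace"].append((goal, execute(state, goal)))
    return state["trace"]

print([goal for goal, _ in cr_planner("sum of the first 10 primes?")])
```

The real system replaces both dummy critics with models trained on data collected via Monte Carlo Tree Search, which is what lets the selection account for the long-term impact of each sub-goal rather than an immediate heuristic.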


Can Language Models Explain Their Own Classification Behavior?

Sherburn, Dane, Chughtai, Bilal, Evans, Owain

arXiv.org Artificial Intelligence

Large language models (LLMs) perform well at a myriad of tasks, but explaining the processes behind this performance is a challenge. This paper investigates whether LLMs can give faithful high-level explanations of their own internal processes. To explore this, we introduce a dataset, ArticulateRules, of few-shot text-based classification tasks generated by simple rules. Each rule is associated with a simple natural-language explanation. We test whether models that have learned to classify inputs competently (both in- and out-of-distribution) are able to articulate freeform natural language explanations that match their classification behavior. Our dataset can be used for both in-context and finetuning evaluations. We evaluate a range of LLMs, demonstrating that articulation accuracy varies considerably between models, with a particularly sharp increase from GPT-3 to GPT-4. We then investigate whether we can improve GPT-3's articulation accuracy through a range of methods. GPT-3 completely fails to articulate 7/10 rules in our test, even after additional finetuning on correct explanations. We release our dataset, ArticulateRules, which can be used to test self-explanation for LLMs trained either in-context or by finetuning.


How to explain Robotic Process Automation (RPA) in plain English

#artificialintelligence

If "machine learning" sounds like the beginning of a bleak dystopian future – think The Terminator mixed with The Matrix – then "robotic process automation" must be the phase when the machines rise up to rule humankind with ruthless efficiency. Fortunately, robotic process automation (RPA) involves nothing of the sort, except perhaps for the efficiency part. There aren't really even any robots involved in this automation software. "Robotic process automation is not a physical [or] mechanical robot," says Chris Huff, chief strategy officer at Kofax.


Microsoft to make coding 'in plain English' easier with PowerFx and GPT-3 AI model

#artificialintelligence

Microsoft is integrating AI technologies with its PowerFx low-code programming language. This integration will enable customers to use natural-language input and "programming by example" techniques when developing with PowerApps. Microsoft announced the coming new capabilities during the opening day, May 25, of its virtual Build 2021 developers conference. Officials said these new features will be in public preview in English throughout North America by the end of June. PowerFx is the low-code textual programming language Microsoft announced earlier this year.



Machine Learning Jargon in Plain English

#artificialintelligence

Supervised learning is, by far, the most researched approach. It is also divided into two major categories: Regression and Classification, with the data structure used to train these algorithms being the common denominator. You can think of them in the form of examples. With each of them, we get some features that pose a question, as well as the correct answer to this question, which we call the target. To better understand it, let us explore each category with an example.
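The features-and-target idea above can be made concrete with two toy tasks, one per category. All data here are made up for illustration, and the "learned" classifier is just a threshold standing in for a real decision boundary.

```python
# Regression: predict a continuous target (price) from a feature (size in m^2).
sizes = [50, 70, 90, 110]       # features: the "question"
prices = [150, 210, 270, 330]   # targets: the "correct answer" (continuous)

# Fit y = a*x + b with ordinary least squares (closed form for one feature).
n = len(sizes)
mean_x = sum(sizes) / n
mean_y = sum(prices) / n
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices)) \
    / sum((x - mean_x) ** 2 for x in sizes)
b = mean_y - a * mean_x
print(round(a * 100 + b))       # predicted price for a 100 m^2 house -> 300

# Classification: predict a discrete target (spam or ham) from a feature
# (count of suspicious words). A threshold stands in for a learned boundary.
def classify(suspicious_words: int) -> str:
    return "spam" if suspicious_words >= 3 else "ham"

print(classify(5), classify(1))  # -> spam ham
```

Both tasks share the same training shape, features paired with a known target; the only difference is that regression predicts a number while classification predicts a category.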


OpenAI GPT-3: Past, Present and Future of AI and NLP

#artificialintelligence

Everyone is talking about the mighty, great, futuristic language model by OpenAI, founded by Tesla CEO Elon Musk, Y Combinator partner Sam Altman and other Silicon Valley big shots, including Google researchers and the ex-CTO of Stripe. It is truly eye-opening. We also want to tell you how exciting it is. Why is GPT-3 so hyped right now? Probably because GPT-3 has the coolest video demos ever: based on just a few English sentences it can generate a TODO app (writing the code by itself), generate Excel spreadsheets, translate automatically, and generate quizzes based on content. Everyone is writing about GPT-3, but because we are technical, our article will give you the important technical details and background you need to understand OpenAI's GPT-3.


Could this artificial intelligence change programming as we know it?

#artificialintelligence

It's easy to imagine the artificial intelligence (AI) revolution as years off in the future, but it might be much closer than you think. When Sharif Shameem posted to Twitter an experiment he did with GPT-3, a closed-access artificial intelligence, thousands in the technology community were stunned. "With GPT-3, I built a layout generator where you just describe any layout you want, and it generates the JSX code for you." With seemingly little effort, the two-minute clip appeared to show an AI that understood how to write fairly complex computer code from a request in plain English, despite never having been trained to write code in the first place – or even to understand English. "It was never explicitly programmed how to read or how to understand English," says Shameem, who has founded a start-up to help people develop web applications by writing in plain English.


How does Machine Learning impact the field of Law?

#artificialintelligence

Today, we see computer programs, algorithms, and robots replacing simple human activities, but one technology stands at the forefront of this spectrum: AI. The consequences of artificial intelligence are so far-reaching that they make us wonder whether we are experiencing the beginning of a new era. According to Gartner, business use of AI has grown by 270% in the last four years, and slightly more than a third of organizations have implemented AI in some way, according to their specific needs. But is this also a reality within the legal sector? AI has found its way into supporting attorneys and clients alike, and there is a clear growing interest in the technology.